Probabilities and Frequency Distributions
POLS 3312: Argument, Data, and Politics

Tom Hanna

2024-01-31

Announcements

  • Grades in Canvas
  • Paper will be posted in Canvas for Monday. You do not need to read it in advance. We will go through it Monday as an example of how to read an academic paper. You DO NEED to have a copy either electronically or printed with you on Monday.

Overview

  • Basic overview of probability
  • Frequency Distributions

Probability

  • Probability is a measure of the likelihood of an event
  • Probability is a number between 0 and 1 (or 0 and 100%)
  • 0 means the event is impossible
  • 1 means the event is certain
  • 0.5 means the event is as likely as not

Finding probability

  • Empirically, probability is estimated as the ratio of the number of times an event occurs to the total number of trials
  • For example, if we flip a coin 10 times and get 5 heads, the estimated probability of heads is 5/10 = 0.5
  • If we flip a coin 100 times and get 50 heads, the estimate is 50/100 = 0.5
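The deck's simulations were done in R, but the idea is easy to sketch in Python: estimate the probability of heads as the relative frequency of heads over many simulated flips. The function name and seed here are illustrative, not from the slides.

```python
import random

def relative_frequency_of_heads(n_flips, seed=0):
    """Flip a fair coin n_flips times and return the share of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips
```

With only 10 flips the estimate bounces around; with 10,000 flips it settles near the true probability of 0.5.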

Why do we use data?

  • Purpose: analyzing data for causal inference, i.e., to begin making statements about cause and effect
  • Complex and uncertain data require that we make…

Assumptions about the data

  • Because the world is complex, to make sense of unknowns we make assumptions about data
  • The assumptions are useful approximations even when not precisely true
  • We still need to check that the real data does not seriously violate the assumptions

Data Assumptions: Random, Independent, and Identically Distributed

  • Randomness and independence matter as assumptions about data
  • Specifically, these are assumptions about the Data Generating Process or DGP
  • The Data Generating Process: the way the world produces the data

The Data Generating Process

  • The source of the data matters - the DGP matters
  • The experiment vs. observation distinction is one way the DGP matters
  • Previously stated: Data comes from a random world
  • So the DGP has a random element

Independence and Distribution

  • Events in the data are independent and identically distributed - the IID assumption

  • Independence is statistical independence - the outcome of one event does not affect our belief about the probability of another event

  • We can draw a random number from a hat, then flip a coin. The hat draw does not affect the probability of the coin toss
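This example can be sketched in Python (the deck itself uses R; all names and the hat's contents here are illustrative): simulate a hat draw followed by a coin flip, and check that the heads rate is roughly 0.5 no matter which number was drawn.

```python
import random
from collections import defaultdict

def heads_rate_by_hat_draw(n_trials=20_000, seed=1):
    """Simulate a hat draw (1-10) then a coin flip; return heads rate per draw."""
    rng = random.Random(seed)
    counts = defaultdict(lambda: [0, 0])  # draw -> [heads, total]
    for _ in range(n_trials):
        draw = rng.randint(1, 10)
        heads = rng.random() < 0.5
        counts[draw][0] += heads
        counts[draw][1] += 1
    return {d: h / t for d, (h, t) in counts.items()}

rates = heads_rate_by_hat_draw()
```

Every hat number ends up with a heads rate near 0.5, which is what statistical independence looks like in data.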

Independence and causation

  • Falsifiable (null) assumption: X does not affect Y

If X does appear to affect Y, we may begin to infer some causal relationship (direct or indirect, in either direction, possibly through one or more additional variables), but not necessarily that X causes Y. This is commonly shortened to the not-quite-accurate summary “correlation does not imply causation.”

Independence and Distribution

  • Events in the data are independent and identically distributed - the IID assumption
  • Independence is statistical independence - the outcome of one event does not affect our belief about the probability of another event
  • Identically distributed: drawn from the same probability distribution

So…

Introduction to distributions

  • The most important is the normal distribution
  • This is because of the central limit theorem
  • We will look at these in the most detail: normal, binomial, uniform, Poisson

Distribution examples

  • The following are histograms
  • They represent the frequency, i.e. the count of observations for each value
  • For example, if the bar at the value 4 reaches 500, it means that 4 came up 500 times in the data
  • The graphs were produced by generating random numbers based on the particular distribution with an R function
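The plots themselves were made in R; a minimal Python sketch of the same idea tallies how often each value occurs. The data here (simulated rolls of a fair die) is an illustrative choice, not the slides' actual data.

```python
import random
from collections import Counter

def frequency_table(values):
    """Count how many times each value appears (the heights of a histogram's bars)."""
    return Counter(values)

# 10,000 rolls of a fair six-sided die
rng = random.Random(42)
rolls = [rng.randint(1, 6) for _ in range(10_000)]
freq = frequency_table(rolls)
```

Each face should appear roughly 10,000 / 6 ≈ 1,667 times, so the six bars come out close to the same height.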

Uniform distribution

All outcomes are equally likely

Normal Distribution

  • Symmetrical around its mean, with most values near the central peak
  • Width is a function of the standard deviation
  • Other names: Gaussian distribution, bell curve
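A small Python sketch of these properties (the mean of 10 and standard deviation of 2 are hypothetical parameters, chosen for illustration): draw from a normal distribution and confirm the sample is centered on its mean with spread set by the standard deviation.

```python
import random
import statistics

def normal_sample(n, mu=0.0, sigma=1.0, seed=7):
    """Draw n values from a Normal(mu, sigma) distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]

# illustrative parameters: mean 10, standard deviation 2
draws = normal_sample(50_000, mu=10, sigma=2)
sample_mean = statistics.fmean(draws)
sample_sd = statistics.pstdev(draws)
```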

Binomial Distribution

  • Binary outcomes: success/failure, yes/no
  • The distribution of the number of successes in a fixed number of Bernoulli trials
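A binomial draw can be sketched as the sum of Bernoulli trials; a minimal Python version (names and parameters here are illustrative, not the slides' R code):

```python
import random

def binomial_draw(n, p, rng):
    """Count successes in n independent Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))

# 2,000 draws of a Binomial(n=25, p=0.5); the expected value is n * p = 12.5
rng = random.Random(3)
samples = [binomial_draw(25, 0.5, rng) for _ in range(2_000)]
mean_successes = sum(samples) / len(samples)
```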

Binomial example

  • n = 1 makes this a Bernoulli distribution

Binomial example

  • trials = 25

Preview of the Central Limit Theorem

What happens if we do the same thing above but do it 1,000 times and plot the counts?
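That experiment can be sketched in Python (the deck uses R; the function name and seed are illustrative): repeat the 25-flip experiment 1,000 times and tally how often each head count appears.

```python
import random
from collections import Counter

def repeated_binomial_counts(n_experiments=1_000, n_trials=25, p=0.5, seed=11):
    """Repeat an n_trials coin-flip experiment many times; tally each head count."""
    rng = random.Random(seed)
    results = [sum(rng.random() < p for _ in range(n_trials))
               for _ in range(n_experiments)]
    return Counter(results)

counts = repeated_binomial_counts()
```

The tally is tallest near 12-13 heads and falls off in both directions, tracing out a rough bell shape.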

The Central Limit Theorem

  • For sufficiently large sample sizes, the distribution of sample means approximates a normal distribution
  • This means that with a large enough number of trials, we can use the normal distribution to learn about measures of central tendency, measures of dispersion, and probabilities
  • A common rule of thumb: sample sizes above 30 are “sufficiently large”
  • This is just a preview

The 68-95-99.7 Rule

  • One of the rules for normal distributions:

  • 68% of the data is within 1 standard deviation of the mean
  • 95% of the data is within 2 standard deviations of the mean
  • 99.7% of the data is within 3 standard deviations of the mean
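The rule can be checked empirically; a Python sketch (a standard normal and the seed are assumed here for illustration):

```python
import random

def share_within(draws, mean, sd, k):
    """Share of draws within k standard deviations of the mean."""
    return sum(abs(x - mean) <= k * sd for x in draws) / len(draws)

rng = random.Random(5)
normal_draws = [rng.gauss(0, 1) for _ in range(100_000)]
```

`share_within(normal_draws, 0, 1, 1)` comes out near 0.68, and the k = 2 and k = 3 shares come out near 0.95 and 0.997.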

The Law of Large Numbers

  • The law of large numbers tells us that if we repeat an experiment a large number of times, the average of the results will be close to the expected value
  • This allows us to use the sample mean as an estimate of the expected value (mean) of the population
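A quick Python sketch of the law, using an illustrative example: a fair six-sided die, whose expected value is (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5.

```python
import random

def average_roll(n_rolls, seed=9):
    """Average of n fair six-sided die rolls; the expected value is 3.5."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls
```

With a handful of rolls the average bounces around; with hundreds of thousands of rolls it hugs 3.5.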

Poisson distribution

  • Count of number of events in a fixed time/space
  • Known constant mean rate of occurrence
  • Events occur independently of the time since the last event
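Python's standard library has no Poisson sampler, so a sketch can use Knuth's classic multiplication algorithm (fine for small rates; the rate of 4.0 and the seed are illustrative assumptions, not from the slides):

```python
import math
import random

def poisson_draw(lam, rng):
    """One Poisson(lam) draw via Knuth's multiplication algorithm."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

rng = random.Random(13)
poisson_samples = [poisson_draw(4.0, rng) for _ in range(20_000)]
sample_mean = sum(poisson_samples) / len(poisson_samples)
```

A Poisson distribution's mean equals its rate, so `sample_mean` lands near 4, and every draw is a non-negative count.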

Poisson distribution

Why we can’t use standard OLS regression for other DGP

  • We base the likelihood that an estimate is significant on its distance from the mean
  • In a normal distribution, values become less likely the further they are from the mean; distributions produced by other DGPs do not behave this way

Authorship and License

Creative Commons License